Dynamically Shaping the Reordering Search Space of Phrase-Based Statistical Machine Translation

Authors

  • Arianna Bisazza
  • Marcello Federico
Abstract

Defining the reordering search space is a crucial issue in phrase-based SMT between distant languages. In fact, the optimal tradeoff between accuracy and complexity of decoding is nowadays reached by harshly limiting the input permutation space. We propose a method to dynamically shape such space and, thus, capture long-range word movements without hurting translation quality or decoding time. The space defined by loose reordering constraints is dynamically pruned through a binary classifier that predicts whether a given input word should be translated right after another. The integration of this model into a phrase-based decoder improves a strong Arabic-English baseline already including state-of-the-art early distortion cost (Moore and Quirk, 2007) and hierarchical phrase orientation models (Galley and Manning, 2008). Significant improvements in the reordering of verbs are achieved by a system that is notably faster than the baseline, while BLEU and METEOR remain stable, or even increase, at a very high distortion limit.
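The pruning idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `allowed_next_positions`, the word-level distortion measure, the fixed probability threshold, the fallback to the single best candidate, and the toy scorer are all assumptions made for illustration. The `score(i, j)` callable stands in for the binary classifier, returning an estimate of how likely source word `j` is to be translated right after source word `i`.

```python
from typing import Callable, List, Set


def allowed_next_positions(
    last: int,
    covered: Set[int],
    sent_len: int,
    score: Callable[[int, int], float],
    loose_limit: int,
    threshold: float,
) -> List[int]:
    """Return source positions that may be translated right after `last`.

    Assumed scheme: a loose distortion limit defines the candidate space,
    then candidates whose classifier score is below `threshold` are pruned.
    """
    # Step 1: candidates permitted by the loose reordering constraint.
    in_limit = [
        j
        for j in range(sent_len)
        if j not in covered and abs(j - last - 1) <= loose_limit
    ]
    # Step 2: dynamic pruning with the (hypothetical) classifier score.
    kept = [j for j in in_limit if score(last, j) >= threshold]
    # Assumed fallback: never leave the decoder without a continuation.
    if not kept and in_limit:
        kept = [max(in_limit, key=lambda j: score(last, j))]
    return kept


def toy_score(i: int, j: int) -> float:
    """Toy stand-in for the classifier: favors monotone steps and one long jump."""
    if j == i + 1 or (i, j) == (0, 5):
        return 0.9
    return 0.1


if __name__ == "__main__":
    print(allowed_next_positions(0, {0}, 8, toy_score, loose_limit=6, threshold=0.5))
```

In this toy setting the long jump from position 0 to position 5 survives because the scorer favors it, while the other distant positions are discarded even though the loose limit would allow them. A real decoder would operate on phrases and coverage vectors rather than on single word positions.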

Related resources

Efficient Solutions for Word Reordering in German-English Phrase-Based Statistical Machine Translation

Despite being closely related languages, German and English are characterized by important word order differences. Long-range reordering of verbs, in particular, represents a real challenge for state-of-the-art SMT systems and is one of the main reasons why translation quality is often so poor in this language pair. In this work, we review several solutions to improve the accuracy of German-Engli...

The Operation Sequence Model - Combining N-Gram-Based and Phrase-Based Statistical Machine Translation

In this article, we present a novel machine translation model, the Operation Sequence Model (OSM), that combines the benefits of phrase-based and N-gram-based SMT and remedies their drawbacks. The model represents the translation process as a linear sequence of operations. The sequence includes not only translation operations but also reordering operations. As in N-gram-based SMT, the model is: ...

A joint translation model with integrated reordering

This dissertation aims to combine the benefits and remedy the flaws of the two popular frameworks in statistical machine translation, namely phrase-based MT and N-gram-based MT. Phrase-based MT advanced the state of the art towards translating phrases rather than words. By memorizing phrases, phrasal MT is able to learn local reorderings, and handling of other local dependencies such as insertio...

An n-gram-based statistical machine translation decoder

In this paper we describe MARIE, an N-gram-based statistical machine translation decoder. It is implemented using a beam search strategy, with distortion (or reordering) capabilities. The underlying translation model is based on an N-gram approach, extended to introduce reordering at the phrase level. The search graph structure is designed to perform very accurate comparisons, which allows for a h...

Journal:
  • TACL

Volume 1

Publication date: 2013